
feat: session transcripts for LLM KV cache stability#512

Merged
senamakel merged 6 commits into tinyhumansai:main from senamakel:feat/llm-kv-cache
Apr 12, 2026

Conversation

@senamakel
Member

@senamakel senamakel commented Apr 12, 2026

Summary

  • Session transcript persistence: Stores the exact Vec<ChatMessage> sent to the LLM provider as .md files at {workspace}/sessions/DDMMYYYY/{agent}_{index}.md. On session resume, the transcript is read back to produce byte-identical messages, ensuring the inference backend's KV cache prefix is reused rather than re-prefilled.
  • API usage/cache parsing: Parses the openhuman.usage and openhuman.billing metadata from API responses (including cached_input_tokens and charged_amount_usd) into UsageInfo, which was previously always None from the compatible provider.
  • Cache stats in transcript headers: Each session file includes cache hit percentage, token counts, and billing in the HTML comment header for debugging.
  • Sub-agent transcript support: Both typed and fork-mode sub-agents now write their own session transcripts.
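The path scheme from the first bullet can be sketched in a few lines (the function name here is hypothetical; the real resolution logic lives in `transcript.rs` and also handles index allocation):

```rust
use std::path::{Path, PathBuf};

// Sketch of the `{workspace}/sessions/DDMMYYYY/{agent}_{index}.md` layout
// described in the summary. The date string is passed in rather than read
// from the clock so the example stays deterministic.
fn transcript_path(workspace: &Path, date_ddmmyyyy: &str, agent: &str, index: u32) -> PathBuf {
    workspace
        .join("sessions")
        .join(date_ddmmyyyy)
        .join(format!("{agent}_{index}.md"))
}
```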

Key files

| File | Change |
| --- | --- |
| `src/openhuman/agent/harness/session/transcript.rs` | New — format parser/writer, path resolution, escaping |
| `src/openhuman/agent/harness/session/turn.rs` | Read hook at turn start, write hook at turn end, cached message usage |
| `src/openhuman/agent/harness/session/types.rs` | New fields: `session_transcript_path`, `cached_transcript_messages`, `agent_definition_name` |
| `src/openhuman/agent/harness/subagent_runner.rs` | Transcript persistence for sub-agents |
| `src/openhuman/providers/compatible.rs` | Parse usage + openhuman blocks from API response |
| `src/openhuman/providers/traits.rs` | `UsageInfo` gains `cached_input_tokens`, `charged_amount_usd` |

Example transcript header

```
<!-- session_transcript
agent: main
dispatcher: native
cache_boundary: 1847
created: 2026-04-11T14:30:00Z
updated: 2026-04-11T14:35:22Z
turn_count: 3
input_tokens: 5000
output_tokens: 1200
cached_input_tokens: 3500
cache_hit_pct: 70.0%
charged_usd: $0.004500
-->
```
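The `cache_hit_pct` line in the header is derivable from the two token counts above (3500 cached of 5000 input tokens gives 70%). A hypothetical helper of this shape computes it:

```rust
// Hypothetical helper: percentage of input tokens that were served from the
// provider's KV cache rather than re-prefilled.
fn cache_hit_pct(cached_input_tokens: u64, input_tokens: u64) -> f64 {
    if input_tokens == 0 {
        return 0.0;
    }
    cached_input_tokens as f64 / input_tokens as f64 * 100.0
}
```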

Test plan

  • cargo check — clean compilation (core + Tauri shell)
  • cargo test — all 2222 tests pass (0 failures)
  • 12 dedicated transcript unit tests: round-trip fidelity, escaping, empty content, whitespace preservation, path resolution, meta parsing with cache stats
  • Manual: run agent session, verify {workspace}/sessions/ directory and transcript content
  • Manual: check cache_hit_pct in transcript header matches expected behavior
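The round-trip fidelity tests listed above boil down to checks of roughly this shape; the delimiter constants and single-pass escaping follow the scheme discussed in the review comments, simplified from the real test module:

```rust
// Simplified sketch of the round-trip property the transcript tests assert:
// escaping the closing delimiter on write and unescaping it on read must
// restore the original message content byte-for-byte.
const MSG_CLOSE: &str = "<!--/MSG-->";
const MSG_CLOSE_ESCAPED: &str = "<!--\\/MSG-->";

fn escape_content(content: &str) -> String {
    content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED)
}

fn unescape_content(content: &str) -> String {
    content.replace(MSG_CLOSE_ESCAPED, MSG_CLOSE)
}
```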

Summary by CodeRabbit

  • New Features
    • Agents can carry human-readable definition names for transcript identification.
    • Session transcripts persist to disk, can be resumed across runs, and record per-turn provider messages and aggregated usage.
    • Usage metrics now include cached input tokens and per-request billing amounts for finer cost tracking.
  • Tests
    • Comprehensive unit tests added to validate transcript read/write, escaping, path resolution, and resume behavior.

…e stability

- Added support for session transcript persistence, allowing the exact `Vec<ChatMessage>` to be stored as human-readable `.md` files.
- Introduced methods for loading previous transcripts to pre-populate messages for KV cache reuse, enhancing session continuity.
- Updated `AgentBuilder` to include an agent definition name for transcript file naming.
- Refactored `Agent` to handle transcript loading and persistence during session turns, ensuring byte-identical message handling across sessions.
- Created a new `transcript.rs` module to encapsulate transcript-related functionality, improving code organization and maintainability.
…tadata

- Added `usage` and `openhuman` fields to the `ApiChatResponse` struct to capture standard OpenAI usage metrics and OpenHuman backend metadata.
- Introduced `ApiUsage`, `OpenHumanMeta`, and related structs to encapsulate detailed usage and billing information.
- Updated `parse_native_response` and `extract_usage` methods to process and return usage data, improving the integration of usage tracking in chat responses.
- Modified the `UsageInfo` struct in traits to include new fields for cached input tokens and charged amount, enhancing the overall usage reporting capabilities.
…der responses

- Added usage extraction functionality to the `OpenAiCompatibleProvider`, enhancing the API response structure to include usage metrics.
- Updated the `parse_native_response` method to utilize a new `wrap_message` function for improved testability and clarity in response handling.
- Refactored tests to accommodate the new usage extraction logic, ensuring comprehensive coverage of the updated response structure.
…e tracking

- Added functionality to persist sub-agent conversation transcripts, enabling inspection for debugging and KV cache analysis.
- Introduced a new `persist_subagent_transcript` function to handle transcript writing, including metadata for input/output tokens and charged amounts.
- Updated the `Agent` implementation to accumulate usage statistics across iterations and include them in the session transcript.
- Enhanced the `write_transcript` function to include additional metadata fields, improving the detail of stored transcripts.
- Refactored tests to validate the new transcript persistence and usage tracking features, ensuring comprehensive coverage.
- Reformatted the `write_transcript` and `read_transcript` functions for better clarity by adjusting line breaks and indentation.
- Enhanced the `parse_meta` function to improve readability by restructuring the `get` closure.
- Updated test cases to maintain consistency with the new formatting, ensuring clarity in the test structure.
- Overall, these changes aim to enhance code maintainability and readability without altering functionality.
@coderabbitai
Contributor

coderabbitai bot commented Apr 12, 2026

📝 Walkthrough

Walkthrough

Adds session transcript persistence and resume: a new transcript module for read/write/find/resolve; agent-level fields and a builder setter for the agent name; turn and sub-agent code to load and persist transcripts and aggregate usage; and provider `UsageInfo` extended with cached-token and billing fields.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Session Transcript Module: `src/openhuman/agent/harness/session/transcript.rs`, `src/openhuman/agent/harness/session/mod.rs` | New crate-visible transcript module implementing MD-based transcript read/write, `resolve_new_transcript_path`, `find_latest_transcript`, metadata types, escaping semantics, and comprehensive unit tests. |
| Agent Types & Builder: `src/openhuman/agent/harness/session/types.rs`, `src/openhuman/agent/harness/session/builder.rs` | Added `agent_definition_name`, `session_transcript_path`, `cached_transcript_messages` to `Agent`; added `AgentBuilder::agent_definition_name(...)` setter; builder defaults `agent_definition_name` to "main" when unset and initializes transcript path/cached messages to `None`. |
| Session Turn Logic: `src/openhuman/agent/harness/session/turn.rs` | On turn start, attempt resume by loading latest transcript when history empty; support one-time cached prefix consumption and delta-appending for provider messages; accumulate per-turn usage and persist session transcript after successful turns; added `try_load_session_transcript()` and `persist_session_transcript()`. |
| Subagent Runner Persistence: `src/openhuman/agent/harness/subagent_runner.rs` | Inner loop now returns aggregated usage; typed/fork modes call `persist_subagent_transcript()` to write transcripts (best-effort) including cache boundary, token counts, and charged amount. |
| Provider Usage & Parsing: `src/openhuman/providers/traits.rs`, `src/openhuman/providers/compatible.rs` | Added `UsageInfo.cached_input_tokens` and `charged_amount_usd` fields; extended `ApiChatResponse` parsing to include OpenHuman usage metadata; introduced `extract_usage()` and updated `parse_native_response` to return usage-aware `ProviderChatResponse`. |
| Test Initializers (no API change): `src/openhuman/context/guard.rs`, `src/openhuman/context/manager.rs`, `src/openhuman/context/pipeline.rs` | Unit tests updated to use `..Default::default()` for `UsageInfo` literals to fill unspecified fields. |
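The usage-fallback behaviour described for the provider cohort (and tightened in review so that missing OpenHuman keys yield `None` instead of 0) follows this pattern. The struct and field names below are simplified illustrations, not the PR's exact types:

```rust
// Sketch of the fallback chain: prefer the OpenHuman metadata block when a
// field is actually present, otherwise fall back to the standard OpenAI-style
// usage block. Using Option<u64> (not #[serde(default)] on u64) is what keeps
// a missing key from deserializing to Some(0) and blocking the fallback.
#[derive(Default)]
struct OpenHumanUsage {
    input_tokens: Option<u64>, // None when the key is absent in the response
}

struct StdUsage {
    prompt_tokens: u64,
}

fn merged_input_tokens(oh: Option<&OpenHumanUsage>, std_usage: Option<&StdUsage>) -> Option<u64> {
    oh.and_then(|u| u.input_tokens)
        .or_else(|| std_usage.map(|u| u.prompt_tokens))
}
```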

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent as Agent
    participant Transcript as Transcript Module
    participant FS as Filesystem
    participant Provider as Provider

    Note over Agent,Transcript: Turn start (resume attempt)
    Agent->>Agent: if history empty && cached_transcript_messages.is_none()
    Agent->>Transcript: find_latest_transcript(workspace, agent_name)
    Transcript->>FS: scan today's and yesterday's session dirs
    FS-->>Transcript: return latest path?
    alt path found
        Agent->>Transcript: read_transcript(path)
        Transcript->>FS: read & parse .md into messages + meta
        FS-->>Transcript: SessionTranscript
        Transcript-->>Agent: provide messages + meta
        Agent->>Agent: set cached_transcript_messages & cache_boundary
    else no path
        Transcript-->>Agent: none
    end

    Note over Agent,Provider: Provider interaction loop
    loop per provider iteration
        alt resumed & first iteration
            Agent->>Agent: consume cached prefix, append history delta
        else
            Agent->>Agent: build messages from full history
        end
        Agent->>Provider: send messages
        Provider-->>Agent: response + usage
        Agent->>Agent: accumulate usage & capture last-sent messages
    end

    Note over Agent,Transcript: Turn completion
    alt turn succeeded
        Agent->>Transcript: persist_session_transcript(path?, messages, aggregated_usage, meta)
        Transcript->>FS: create date dirs & write .md with metadata + <!--MSG--> blocks
        FS-->>Transcript: write success/failure
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰
With whiskers poised by midnight's pane,
I saved our chats in markdown grain,
Cached prefixes tucked in rows so neat,
Resume, persist—then hop a beat.
Hooray, small rabbit, tracks complete!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: session transcripts for LLM KV cache stability, which directly relates to the primary objective of enabling KV cache reuse through byte-identical message persistence. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
src/openhuman/providers/compatible.rs (1)

1066-1099: Add debug trace for usage-source selection in extract_usage.

This branch is critical for token accounting and billing diagnostics; please add a structured debug log indicating whether values came from openhuman or standard usage and whether billing metadata was present (without sensitive payload logging).
As per coding guidelines: src/**/*.rs: Add substantial, development-oriented logs on new/changed flows; include logs at branch decisions and error handling, with structured stable prefixes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/providers/compatible.rs` around lines 1066 - 1099, In
extract_usage, add a structured debug trace that logs which source provided
token counts (openhuman vs standard usage) and whether billing metadata was
present: inspect oh, std_usage, oh_usage, and oh_billing and emit a
development-level logger call (with a stable prefix) that records booleans like
from_openhuman, from_standard, and has_billing and the numeric token values
(input_tokens, output_tokens, cached_input_tokens) but no sensitive payloads;
ensure the log is placed after computing the token variables and before
returning the ProviderUsageInfo so the instrumentation reflects the branch
decision used to populate ProviderUsageInfo.
src/openhuman/agent/harness/subagent_runner.rs (1)

432-444: Sub-agent transcripts lack usage metrics.

The token and cost fields are hardcoded to zero, whereas the main agent's turn.rs accumulates actual usage from provider responses. For consistency and debugging value, consider threading usage info from run_inner_loop to the transcript. Currently run_inner_loop discards the ChatResponse.usage data.

This is acceptable for the initial implementation since transcripts are primarily for KV cache prefix replay, and sub-agent usage is typically a small fraction of total cost.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/subagent_runner.rs` around lines 432 - 444,
TranscriptMeta is being populated with zeroed usage and cost fields; update the
pipeline to thread actual usage from run_inner_loop into the transcript: modify
run_inner_loop (and its callers) to return or expose the ChatResponse.usage (or
an aggregated Usage struct) instead of discarding it, sum input/output/token
usage and charged amount across sub-agent provider responses, and then set
TranscriptMeta.input_tokens, output_tokens, cached_input_tokens (if applicable),
and charged_amount_usd from that aggregated Usage when constructing
TranscriptMeta in subagent_runner.rs so transcripts reflect real usage for
debugging and accounting.
src/openhuman/agent/harness/session/transcript.rs (1)

207-213: Edge case in escape/unescape.

If content contains the literal escape sequence <!--\/MSG-->, it will be corrupted to <!--/MSG--> on round-trip. This is extremely unlikely in practice (LLM conversations won't contain this exact byte sequence), but worth documenting in a code comment for future maintainers.

📝 Suggested documentation
 fn escape_content(content: &str) -> String {
+    // Note: if content contains the literal escape sequence `<!--\/MSG-->`,
+    // it will be incorrectly unescaped on read. This is an accepted limitation
+    // for an extremely unlikely edge case.
     content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/session/transcript.rs` around lines 207 - 213,
The current escape_content and unescape_content functions perform a simple
replace between MSG_CLOSE and MSG_CLOSE_ESCAPED but are not fully reversible if
the original content already contains the literal escape sequence
(<!--\/MSG-->), which will be transformed on round-trip; add a concise code
comment above the functions escape_content and unescape_content documenting this
edge case (mention the exact sequences MSG_CLOSE and MSG_CLOSE_ESCAPED and that
an input containing MSG_CLOSE_ESCAPED will be corrupted on unescape), and
optionally note that a more robust fix would be to use a non-ambiguous encoding
or a two-pass escaping scheme if full reversibility is later required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/providers/compatible.rs`:
- Around line 1000-1008: The helper parse_native_response currently returns a
successful ProviderChatResponse with text None when api_response.choices is
empty; change parse_native_response to return a Result<ProviderChatResponse,
Error> (or existing error type) and propagate an Err when
choices.into_iter().next() is None so empty choices are treated as an error;
update callers (e.g., chat() and the other occurrence around
parse_native_response usage at the second site) to handle the Result (propagate
or map the error) to keep the contract consistent and avoid silently accepting
invalid payloads.
- Around line 333-343: OpenHumanUsage currently defines non-optional u64 token
fields (input_tokens/output_tokens/total_tokens/cached_input_tokens) with
#[serde(default)] which deserializes missing fields to 0 and causes
oh_usage.map(|u| u.input_tokens) to return Some(0) and block fallback; change
those fields to Option<u64> (e.g., input_tokens: Option<u64>) and update
extract_usage to use oh_usage.and_then(|u| u.input_tokens) / and_then for each
token so missing keys yield None and allow the fallback chain to proceed; also
add a debug/trace log in extract_usage noting which branch provided the usage
(OpenHuman vs standard source) when making the branch decision so flow changes
are observable.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f588128-4c71-45d8-8ba5-4b01eac28a71

📥 Commits

Reviewing files that changed from the base of the PR and between 1cb70e3 and 3b1ed77.

📒 Files selected for processing (11)
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/mod.rs
  • src/openhuman/agent/harness/session/transcript.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/subagent_runner.rs
  • src/openhuman/context/guard.rs
  • src/openhuman/context/manager.rs
  • src/openhuman/context/pipeline.rs
  • src/openhuman/providers/compatible.rs
  • src/openhuman/providers/traits.rs

…64> for OpenHumanUsage fields

- parse_native_response now returns Result and propagates an error when
  choices is empty instead of silently returning text: None
- OpenHumanUsage token fields changed to Option<u64> so missing keys
  yield None and allow fallback to standard OpenAI usage block
- extract_usage now logs which source (openhuman vs standard) provided
  token counts via structured tracing
- run_inner_loop returns AggregatedUsage so subagent transcripts reflect
  real usage instead of zeros
- Document escape_content/unescape_content edge case with pre-existing
  escape sequences
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/agent/harness/session/transcript.rs`:
- Around line 207-223: The current escape_content/unescape_content pair is not
bijective because a pre-existing literal MSG_CLOSE_ESCAPED becomes MSG_CLOSE
during unescape; fix by making the scheme reversible: in escape_content
(function escape_content) first escape backslashes (e.g., replace "\" with an
escaped marker) and then escape MSG_CLOSE (MSG_CLOSE -> MSG_CLOSE_ESCAPED), and
in unescape_content perform the exact reverse order (first unescape
MSG_CLOSE_ESCAPED -> MSG_CLOSE, then unescape the backslash marker -> "\"), or
alternatively switch to a length-delimited encoding (prefix content with its
byte length) inside escape_content and decode accordingly in unescape_content;
update references to MSG_CLOSE and MSG_CLOSE_ESCAPED and ensure both functions
are exact inverses so every input round-trips exactly.
- Around line 166-175: resolve_new_transcript_path currently races by returning
a suggested filename (max+1) without reserving it; change the allocation to
atomically reserve a new transcript file by attempting to create the file with
O_EXCL style semantics in a retry loop: in resolve_new_transcript_path (or a new
helper e.g., reserve_new_transcript_path) compute sanitized name and index, then
try to open/create the candidate path using create_new
(OpenOptions::new().write(true).create_new(true)) or equivalent, and if
create_new fails because the file exists, increment index and retry until
success (with a reasonable cap/backoff); return the successfully created PathBuf
so the later fs::write in subagent_runner (calls at lines referenced) can safely
write without clobbering another concurrent writer. Ensure errors propagate via
Result and clean up partially created files on failure if needed.
- Around line 268-320: The parser currently treats malformed message tags as
end-of-input by using `break`, causing truncated files to load silently; in
`parse_messages()` replace those `break` branches (specifically the `else {
break; }` after failing to find `MSG_OPEN_SUFFIX`/role, and the final
double-`find` that falls back to `break` when `MSG_CLOSE` isn't found) with hard
errors (e.g., `anyhow::bail!` or `Err(anyhow!(...))`) that include context (the
current `search_from`/content_start index, the expected token like
`MSG_OPEN_PREFIX`/`MSG_OPEN_SUFFIX`/`MSG_CLOSE`, and a short snippet of the
remaining `raw`), so callers get an explicit parse error instead of a partial
`Ok(messages)` return; keep the successful message pushes using
`unescape_content` and `ChatMessage` unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3d0eda17-61b5-4a1b-81b8-da8769149049

📥 Commits

Reviewing files that changed from the base of the PR and between 3b1ed77 and 1011fa6.

📒 Files selected for processing (3)
  • src/openhuman/agent/harness/session/transcript.rs
  • src/openhuman/agent/harness/subagent_runner.rs
  • src/openhuman/providers/compatible.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/providers/compatible.rs

Comment on lines +166 to +175
```rust
pub fn resolve_new_transcript_path(workspace_dir: &Path, agent_name: &str) -> Result<PathBuf> {
    let date_dir = today_session_dir(workspace_dir);
    fs::create_dir_all(&date_dir)
        .with_context(|| format!("create session dir {}", date_dir.display()))?;

    let sanitized = sanitize_agent_name(agent_name);
    let next_index = next_index(&date_dir, &sanitized)?;
    let filename = format!("{}_{}.md", sanitized, next_index);

    Ok(date_dir.join(filename))
```
Contributor


⚠️ Potential issue | 🟠 Major

Path allocation can collide under concurrent writes.

resolve_new_transcript_path() computes max + 1 and returns the filename without reserving it. The new sub-agent call sites in src/openhuman/agent/harness/subagent_runner.rs at Lines 302-308 and 392-398 can hit this concurrently for the same agent_id, choose the same {agent}_{index}.md, and let the later fs::write overwrite the first transcript. This needs an atomic reservation step (create_new + retry loop, or equivalent) in the allocation/write path.

Also applies to: 339-356

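The atomic reservation the comment asks for can be sketched with `create_new` (O_EXCL semantics). Names below are hypothetical and the loop cap is arbitrary; this is an illustration of the technique, not the PR's code:

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;
use std::path::{Path, PathBuf};

// Try `{agent}_{index}.md` with create_new so the file is reserved atomically;
// on AlreadyExists bump the index and retry, giving up after a fixed cap.
fn reserve_transcript_path(dir: &Path, agent: &str, start: u32) -> std::io::Result<PathBuf> {
    for index in start..start + 1000 {
        let candidate = dir.join(format!("{agent}_{index}.md"));
        match OpenOptions::new().write(true).create_new(true).open(&candidate) {
            // The empty file now exists on disk, so a later write to this
            // path cannot clobber a concurrent writer's transcript.
            Ok(_) => return Ok(candidate),
            Err(e) if e.kind() == ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        }
    }
    Err(std::io::Error::new(
        ErrorKind::Other,
        "transcript index space exhausted",
    ))
}
```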

Comment on lines +207 to +223
/// Escape the closing delimiter so it cannot appear literally in message
/// content. Replaces `<!--/MSG-->` with `<!--\/MSG-->`.
///
/// **Known edge case:** if the original content already contains the literal
/// escape sequence `<!--\/MSG-->`, `unescape_content` will convert it into
/// `<!--/MSG-->`, corrupting the round-trip. In practice this sequence is
/// vanishingly unlikely in LLM output. A fully reversible fix would require
/// a two-pass escaping scheme (e.g. also escaping the backslash), but that
/// added complexity is not warranted unless the edge case materialises.
fn escape_content(content: &str) -> String {
content.replace(MSG_CLOSE, MSG_CLOSE_ESCAPED)
}

/// Reverse the escaping applied by [`escape_content`]. See that function's
/// doc comment for the known edge case with pre-existing escape sequences.
fn unescape_content(content: &str) -> String {
content.replace(MSG_CLOSE_ESCAPED, MSG_CLOSE)
Contributor


⚠️ Potential issue | 🟠 Major

Make the content escaping bijective.

This escape/unescape pair still cannot round-trip every valid ChatMessage.content: a literal <!--\/MSG--> is read back as <!--/MSG-->. That breaks the core “exact bytes back out” guarantee for any prompt or tool output that already contains the escaped token. Please switch to a fully reversible encoding here (for example, a length-delimited body or another bijective escaping scheme).

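One fully reversible scheme of the kind the comment asks for can be sketched as follows: escape backslashes first on write and undo them last on read, so a pre-existing literal `<!--\/MSG-->` survives the round-trip. This is an illustration only, not the PR's eventual fix:

```rust
// Bijective two-pass escaping. Pass 1 doubles every backslash, so any
// backslash remaining single after pass 2 must have been introduced by the
// delimiter escape; unescaping reverses the passes in the opposite order.
const MSG_CLOSE: &str = "<!--/MSG-->";
const MSG_CLOSE_ESCAPED: &str = "<!--\\/MSG-->";

fn escape_content(content: &str) -> String {
    content
        .replace('\\', "\\\\") // pass 1: escape backslashes
        .replace(MSG_CLOSE, MSG_CLOSE_ESCAPED) // pass 2: escape the delimiter
}

fn unescape_content(content: &str) -> String {
    content
        .replace(MSG_CLOSE_ESCAPED, MSG_CLOSE) // reverse pass 2 first
        .replace("\\\\", "\\") // then reverse pass 1
}
```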

Comment on lines +268 to +320
fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>> {
let mut messages = Vec::new();
let mut search_from = 0;

loop {
// Find next <!--MSG role="..."--> opening tag
let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
break;
};
let open_start = search_from + open_start;
let after_prefix = open_start + MSG_OPEN_PREFIX.len();

// Extract role from between the quotes
let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
break;
};
let role = raw[after_prefix..after_prefix + role_end].to_string();

// Content starts after the opening tag + newline
let content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
let content_start = if raw[content_start..].starts_with('\n') {
content_start + 1
} else {
content_start
};

// Find closing tag
let close_tag = format!("\n{MSG_CLOSE}");
let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
// Try without leading newline for empty content
let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
break;
};
let content = &raw[content_start..content_start + content_end_rel];
messages.push(ChatMessage {
role,
content: unescape_content(content),
});
search_from = content_start + content_end_rel + MSG_CLOSE.len();
continue;
};

let content = &raw[content_start..content_start + content_end_rel];
messages.push(ChatMessage {
role,
content: unescape_content(content),
});

search_from = content_start + content_end_rel + close_tag.len();
}

Ok(messages)
}
Contributor


⚠️ Potential issue | 🟠 Major

Malformed transcripts currently load as partial history.

The error branches in parse_messages() break and then return Ok(messages), so a truncated or hand-edited file is treated as a valid-but-shorter conversation. For resume/cache reuse that is worse than failing fast, because it silently changes the prefix sent back to the model. Please turn these malformed-tag paths into hard parse errors with context.

Suggested direction
```diff
-        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
-            break;
-        };
+        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
+            anyhow::bail!(
+                "malformed transcript: unterminated role header at byte {}",
+                open_start
+            );
+        };
 ...
-        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
+        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
             // Try without leading newline for empty content
-            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
-                break;
-            };
+            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
+                anyhow::bail!(
+                    "malformed transcript: missing closing tag for role '{}'",
+                    role
+                );
+            };
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>> {
    let mut messages = Vec::new();
    let mut search_from = 0;
    loop {
        // Find next <!--MSG role="..."--> opening tag
        let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
            break;
        };
        let open_start = search_from + open_start;
        let after_prefix = open_start + MSG_OPEN_PREFIX.len();
        // Extract role from between the quotes
        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
            break;
        };
        let role = raw[after_prefix..after_prefix + role_end].to_string();
        // Content starts after the opening tag + newline
        let content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
        let content_start = if raw[content_start..].starts_with('\n') {
            content_start + 1
        } else {
            content_start
        };
        // Find closing tag
        let close_tag = format!("\n{MSG_CLOSE}");
        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
            // Try without leading newline for empty content
            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
                break;
            };
            let content = &raw[content_start..content_start + content_end_rel];
            messages.push(ChatMessage {
                role,
                content: unescape_content(content),
            });
            search_from = content_start + content_end_rel + MSG_CLOSE.len();
            continue;
        };
        let content = &raw[content_start..content_start + content_end_rel];
        messages.push(ChatMessage {
            role,
            content: unescape_content(content),
        });
        search_from = content_start + content_end_rel + close_tag.len();
    }
    Ok(messages)
}
fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>> {
    let mut messages = Vec::new();
    let mut search_from = 0;
    loop {
        // Find next <!--MSG role="..."--> opening tag
        let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
            break;
        };
        let open_start = search_from + open_start;
        let after_prefix = open_start + MSG_OPEN_PREFIX.len();
        // Extract role from between the quotes
        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
            anyhow::bail!(
                "malformed transcript: unterminated role header at byte {}",
                open_start
            );
        };
        let role = raw[after_prefix..after_prefix + role_end].to_string();
        // Content starts after the opening tag + newline
        let content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
        let content_start = if raw[content_start..].starts_with('\n') {
            content_start + 1
        } else {
            content_start
        };
        // Find closing tag
        let close_tag = format!("\n{MSG_CLOSE}");
        let Some(content_end_rel) = raw[content_start..].find(&close_tag) else {
            // Try without leading newline for empty content
            let Some(content_end_rel) = raw[content_start..].find(MSG_CLOSE) else {
                anyhow::bail!(
                    "malformed transcript: missing closing tag for role '{}'",
                    role
                );
            };
            let content = &raw[content_start..content_start + content_end_rel];
            messages.push(ChatMessage {
                role,
                content: unescape_content(content),
            });
            search_from = content_start + content_end_rel + MSG_CLOSE.len();
            continue;
        };
        let content = &raw[content_start..content_start + content_end_rel];
        messages.push(ChatMessage {
            role,
            content: unescape_content(content),
        });
        search_from = content_start + content_end_rel + close_tag.len();
    }
    Ok(messages)
}
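A self-contained sketch of the fail-fast pattern the suggestion describes, with a test pinning down the behavior difference. The tag constants and `ChatMessage` struct here mirror the names used in `transcript.rs` but are assumptions, not the actual definitions, and `Err(String)` stands in for the `anyhow` error type:

```rust
// Simplified fail-fast variant of the transcript parser. Constants and the
// ChatMessage struct are assumed stand-ins mirroring transcript.rs names.
const MSG_OPEN_PREFIX: &str = "<!--MSG role=\"";
const MSG_OPEN_SUFFIX: &str = "\"-->";
const MSG_CLOSE: &str = "<!--/MSG-->";

#[derive(Debug)]
struct ChatMessage {
    role: String,
    content: String,
}

fn parse_messages(raw: &str) -> Result<Vec<ChatMessage>, String> {
    let mut messages = Vec::new();
    let mut search_from = 0;
    loop {
        let Some(open_start) = raw[search_from..].find(MSG_OPEN_PREFIX) else {
            break; // no more opening tags: the normal end of input
        };
        let open_start = search_from + open_start;
        let after_prefix = open_start + MSG_OPEN_PREFIX.len();
        let Some(role_end) = raw[after_prefix..].find(MSG_OPEN_SUFFIX) else {
            // Hard error instead of silently truncating the history.
            return Err(format!(
                "malformed transcript: unterminated role header at byte {open_start}"
            ));
        };
        let role = raw[after_prefix..after_prefix + role_end].to_string();
        let mut content_start = after_prefix + role_end + MSG_OPEN_SUFFIX.len();
        if raw[content_start..].starts_with('\n') {
            content_start += 1;
        }
        let close_tag = format!("\n{MSG_CLOSE}");
        let (end_rel, tag_len) = match raw[content_start..].find(&close_tag) {
            Some(i) => (i, close_tag.len()),
            // Empty content has no leading newline before the close tag.
            None => match raw[content_start..].find(MSG_CLOSE) {
                Some(i) => (i, MSG_CLOSE.len()),
                None => {
                    return Err(format!(
                        "malformed transcript: missing closing tag for role '{role}'"
                    ))
                }
            },
        };
        messages.push(ChatMessage {
            role,
            content: raw[content_start..content_start + end_rel].to_string(),
        });
        search_from = content_start + end_rel + tag_len;
    }
    Ok(messages)
}

fn main() {
    // Well-formed input still parses.
    let ok = "<!--MSG role=\"user\"-->\nhello\n<!--/MSG-->";
    assert_eq!(parse_messages(ok).unwrap().len(), 1);
    // A truncated transcript now fails loudly instead of loading as a
    // shorter history (which would silently change the KV-cache prefix).
    let truncated = "<!--MSG role=\"user\"-->\nhello";
    assert!(parse_messages(truncated).is_err());
}
```

With the `break` version, the truncated input above would return `Ok` with zero messages and the resumed session would re-prefill from scratch with a different prefix; the hard error surfaces the corruption at load time instead.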
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/session/transcript.rs` around lines 268-320,
the parser currently treats malformed message tags as end-of-input by using
`break`, causing truncated files to load silently. In `parse_messages()`, replace
those `break` branches (specifically the `else { break; }` after failing to find
`MSG_OPEN_SUFFIX` for the role, and the fallback `break` when neither close-tag
`find` matches `MSG_CLOSE`) with hard errors (e.g., `anyhow::bail!` or
`Err(anyhow!(...))`) that include context: the current
`search_from`/`content_start` index, the expected token such as
`MSG_OPEN_PREFIX`/`MSG_OPEN_SUFFIX`/`MSG_CLOSE`, and a short snippet of the
remaining `raw`. Callers then get an explicit parse error instead of a partial
`Ok(messages)` return. Keep the successful message pushes using
`unescape_content` and `ChatMessage` unchanged.

@senamakel senamakel merged commit ace0006 into tinyhumansai:main Apr 12, 2026
9 checks passed